central agent
Adaptive Federated Learning via Dynamical System Model
Agarwal, Aayushya, Pileggi, Larry, Joshi, Gauri
Hyperparameter selection is critical for stable and efficient convergence of heterogeneous federated learning, where clients differ in computational capabilities and data distributions are non-IID. Tuning hyperparameters is a manual and computationally expensive process, as the hyperparameter space grows combinatorially with the number of clients. To address this, we introduce an end-to-end adaptive federated learning method in which both clients and central agents adaptively select their local learning rates and momentum parameters. Our approach models federated learning as a dynamical system, allowing us to draw on principles from numerical simulation and physical design. Through this perspective, selecting momentum parameters equates to critically damping the system for fast, stable convergence, while learning rates for clients and central servers are adaptively selected to satisfy accuracy properties from numerical simulation. The result is an adaptive, momentum-based federated learning algorithm in which the learning rates for clients and servers are dynamically adjusted and controlled by a single, global hyperparameter. By designing a fully integrated solution for both adaptive client updates and central agent aggregation, our method handles key challenges of heterogeneous federated learning, including objective inconsistency and client drift. Importantly, our approach achieves fast convergence while being insensitive to the choice of the global hyperparameter, making it well-suited for rapid prototyping and scalable deployment. Compared to state-of-the-art adaptive methods, our framework is shown to deliver superior convergence for heterogeneous federated learning while eliminating the need to tune hyperparameters for both client and server updates.
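As a minimal illustration of the damping perspective, the sketch below runs heavy-ball momentum on a one-dimensional quadratic, with the momentum parameter set to the textbook critical-damping value for the given curvature and step size. This is a generic relation, not the paper's actual selection rule, which is derived from its dynamical-system model; all names here are illustrative.

```python
import numpy as np

# Toy quadratic objective f(x) = 0.5 * lam * x^2 with curvature lam.
# Viewing heavy-ball momentum as a damped oscillator, critical damping
# for step size eta is obtained at beta = (1 - sqrt(eta * lam)) ** 2
# (a classical result; the paper's adaptive rule may differ).
lam, eta = 4.0, 0.01
beta = (1.0 - np.sqrt(eta * lam)) ** 2

x, v = 5.0, 0.0
for _ in range(500):
    grad = lam * x
    v = beta * v - eta * grad   # momentum (velocity) update
    x = x + v                   # parameter update

print(abs(x) < 1e-6)  # True: fast convergence without oscillation
```

At this beta the discrete iteration has a double eigenvalue (critical damping), so the iterate decays geometrically without overshooting the minimum.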
- North America > United States > Virginia (0.04)
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
Subsidy design for better social outcomes
Balcan, Maria-Florina, Pozzi, Matteo, Sharma, Dravyansh
Overcoming the impact of selfish behavior of rational players in multiagent systems is a fundamental problem in game theory. Without any intervention from a central agent, strategic users take actions in order to maximize their personal utility, which can lead to extremely inefficient overall system performance, often indicated by a high Price of Anarchy. Recent work (Lin et al. 2021) investigated and formalized yet another undesirable behavior of rational agents, that of avoiding freely available information about the game for selfish reasons, leading to worse social outcomes. A central planner can significantly mitigate these issues by injecting a subsidy to reduce certain costs associated with the system and obtain net gains in the system performance. Crucially, the planner needs to determine how to allocate this subsidy effectively. We formally show that designing subsidies that perfectly optimize the social good, in terms of minimizing the Price of Anarchy or preventing the information avoidance behavior, is computationally hard under standard complexity theoretic assumptions. On the positive side, we show that we can learn provably good values of subsidy in repeated games coming from the same domain. This data-driven subsidy design approach avoids solving computationally hard problems for unseen games by learning over polynomially many games. We also show that optimal subsidy can be learned with no regret given an online sequence of games, under mild assumptions on the cost matrix. Our study focuses on two distinct games: a Bayesian extension of the well-studied fair cost-sharing game, and a component maintenance game with engineering applications.
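The effect of a subsidy on the Price of Anarchy can be illustrated on a tiny fair cost-sharing instance. The numbers and setup below are a hypothetical two-player example, not one from the paper: each player picks a shared resource (cost split equally among its users) or a private one, and a subsidy on the shared resource eliminates the inefficient equilibrium.

```python
from itertools import product

def cost(profile, shared_cost, private_cost):
    """Per-player costs in a two-resource fair cost-sharing game."""
    users = sum(1 for a in profile if a == "shared")
    return [shared_cost / users if a == "shared" else private_cost
            for a in profile]

def pure_nash_profiles(n, shared_cost, private_cost):
    """Enumerate pure-strategy Nash equilibria and their social costs."""
    nash = []
    for profile in product(["shared", "private"], repeat=n):
        c = cost(profile, shared_cost, private_cost)
        stable = True
        for i in range(n):
            dev = list(profile)
            dev[i] = "private" if profile[i] == "shared" else "shared"
            if cost(dev, shared_cost, private_cost)[i] < c[i]:
                stable = False
                break
        if stable:
            nash.append((profile, sum(c)))
    return nash

n, shared_cost, private_cost = 2, 10.0, 6.0
profiles = list(product(["shared", "private"], repeat=n))
opt = min(sum(cost(p, shared_cost, private_cost)) for p in profiles)
nash = pure_nash_profiles(n, shared_cost, private_cost)
print(max(c for _, c in nash) / opt)   # PoA = 1.2 before subsidy

# A subsidy of 5 on the shared resource removes the inefficient
# both-private equilibrium, leaving only the social optimum:
nash_sub = pure_nash_profiles(n, shared_cost - 5.0, private_cost)
print([p for p, _ in nash_sub])
```

Before the subsidy, both-shared (cost 10, the optimum) and both-private (cost 12) are equilibria; after the subsidy, a lone deviator to the shared resource pays 5 < 6, so only the optimum survives.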
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
- North America > United States > Illinois > Cook County > Chicago (0.04)
- (2 more...)
- Leisure & Entertainment > Games (0.86)
- Health & Medicine > Therapeutic Area > Oncology (0.45)
Intrinsic Action Tendency Consistency for Cooperative Multi-Agent Reinforcement Learning
Zhang, Junkai, Zhang, Yifan, Zhang, Xi Sheryl, Zang, Yifan, Cheng, Jian
Efficient collaboration in the centralized training with decentralized execution (CTDE) paradigm remains a challenge in cooperative multi-agent systems. We identify divergent action tendencies among agents as a significant obstacle to CTDE's training efficiency, requiring a large number of training samples to achieve a unified consensus on agents' policies. This divergence stems from the lack of adequate team consensus-related guidance signals during credit assignments in CTDE. To address this, we propose Intrinsic Action Tendency Consistency, a novel approach for cooperative multi-agent reinforcement learning. It integrates intrinsic rewards, obtained through an action model, into a reward-additive CTDE (RA-CTDE) framework. We formulate an action model that enables surrounding agents to predict the central agent's action tendency. Leveraging these predictions, we compute a cooperative intrinsic reward that encourages agents to match their actions with their neighbors' predictions. We establish the equivalence between RA-CTDE and CTDE through theoretical analyses, demonstrating that CTDE's training process can be achieved using agents' individual targets. Building on this insight, we introduce a novel method to combine intrinsic rewards and CTDE. Extensive experiments on challenging tasks in SMAC and GRF benchmarks showcase the improved performance of our method.
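A toy version of the consistency idea can be sketched as follows: neighbors predict the central agent's action tendency, and the intrinsic reward is higher when the agent's actual action distribution matches the mean prediction. The negative-L2 reward shape and all names here are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def intrinsic_reward(neighbor_predictions, actual_action_dist):
    """Consistency-style intrinsic reward: compare the agent's action
    distribution with the mean of its neighbors' predictions of it
    (negative L2 gap here; the paper's reward shape may differ)."""
    mean_pred = np.mean(neighbor_predictions, axis=0)
    return -np.linalg.norm(actual_action_dist - mean_pred)

# Two neighbors predict the central agent's tendency over 3 actions.
preds = [np.array([0.8, 0.1, 0.1]), np.array([0.6, 0.3, 0.1])]
aligned = intrinsic_reward(preds, np.array([0.7, 0.2, 0.1]))
divergent = intrinsic_reward(preds, np.array([0.1, 0.1, 0.8]))
print(aligned > divergent)   # True: matching neighbors' predictions pays
```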
Privacy-Aware Data Acquisition under Data Similarity in Regression Markets
Pandey, Shashi Raj, Pinson, Pierre, Popovski, Petar
Data markets facilitate decentralized data exchange for applications such as prediction, learning, or inference. The design of these markets is challenged by varying privacy preferences as well as data similarity among data owners. Related works have often overlooked how data similarity impacts pricing and data value through statistical information leakage. We demonstrate that data similarity and privacy preferences are integral to market design and propose a query-response protocol using local differential privacy for a two-party data acquisition mechanism. In our regression data market model, we analyze strategic interactions between privacy-aware owners and the learner as a Stackelberg game over the asked price and privacy factor. Finally, we numerically evaluate how data similarity affects market participation and traded data value.
Shashi Raj Pandey and Petar Popovski are with the Connectivity Section, Department of Electronic Systems, Aalborg University, Denmark. Pierre Pinson's primary affiliation is the Dyson School of Design Engineering, Imperial College London, UK; he is also affiliated with the Technical University of Denmark, Department of Technology, Management and Economics, as well as with Halfspace. This work was supported by the Villum Investigator Grant "WATER" from the Velux Foundation, Denmark.
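A generic sketch of one epsilon-LDP query-response step via the Laplace mechanism is shown below; the function name and setup are assumptions for illustration, and the paper's protocol additionally ties the privacy factor to the asked price through the Stackelberg game.

```python
import numpy as np

rng = np.random.default_rng(0)

def ldp_response(true_value, sensitivity, epsilon, rng):
    """Owner's epsilon-LDP reply to a numeric query via the Laplace
    mechanism: the noise scale grows as the privacy factor epsilon
    shrinks, degrading the statistical value of the traded data."""
    return true_value + rng.laplace(scale=sensitivity / epsilon)

true_stat = 2.0
weak = [ldp_response(true_stat, 1.0, 10.0, rng) for _ in range(5000)]
strong = [ldp_response(true_stat, 1.0, 0.5, rng) for _ in range(5000)]
# Stronger privacy (small epsilon) -> noisier responses:
print(np.std(weak) < np.std(strong))   # True
```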
- Europe > United Kingdom > England > Greater London > London (0.34)
- Europe > Denmark > North Jutland > Aalborg (0.24)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- (7 more...)
- Personal (0.93)
- Research Report (0.64)
- Information Technology > Security & Privacy (1.00)
- Transportation > Ground > Road (0.93)
- Transportation > Passenger (0.68)
Bayesian Regression Markets
Falconer, Thomas, Kazempour, Jalal, Pinson, Pierre
Data is the lifeblood of machine learning, yet for many firms, obtaining datasets of sufficient quality remains a challenge, as data is naturally distributed amongst owners with heterogeneous characteristics (e.g., privacy preferences). This has motivated several developments in the field of collaborative analytics, also known as federated learning (Figure 1a), where models are trained on local servers without the need for data centralization, thereby preserving privacy and distributing the computational burden (Kairouz et al., 2019). However, this framework provides only an incentive-free means for data sharing, relying on the critical assumption that owners are willing to collaborate (i.e., by sharing their private information) altruistically. This rather strong assumption may be violated if owners are competitors in a downstream market environment (Gal-Or, 1985). Consequently, a fruitful area of research has emerged that proposes to instead commoditize data within a market-based framework, where compensation (e.g., remuneration) can be used as an incentive for collaboration (Bergemann and Bonatti, 2019).
- Europe > Middle East > Cyprus (0.04)
- Europe > Greece (0.04)
- Asia > Middle East > Republic of Türkiye (0.04)
- (7 more...)
- Energy > Power Industry (0.92)
- Banking & Finance (0.88)
- Energy > Renewable > Solar (0.68)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Partially Observable Mean Field Multi-Agent Reinforcement Learning Based on Graph-Attention
Yang, Min, Liu, Guanjun, Zhou, Ziyuan
Traditional multi-agent reinforcement learning algorithms are difficult to apply in a large-scale multi-agent environment. The introduction of mean field theory has enhanced the scalability of multi-agent reinforcement learning in recent years. This paper considers partially observable multi-agent reinforcement learning (MARL), where each agent can only observe other agents within a fixed range. This partial observability affects the agent's ability to assess the quality of the actions of surrounding agents. This paper focuses on developing a method to capture more effective information from local observations in order to select more effective actions. Previous work in this field employs probability distributions or weighted mean fields to update the average actions of neighborhood agents, but it does not fully consider the feature information of surrounding neighbors and can lead to a local optimum. In this paper, we propose a novel multi-agent reinforcement learning algorithm, Partially Observable Mean Field Multi-Agent Reinforcement Learning based on Graph-Attention (GAMFQ), to remedy this flaw. GAMFQ uses a graph attention module and a mean field module to describe how an agent is influenced by the actions of other agents at each time step. The graph attention module consists of a graph attention encoder and a differentiable attention mechanism, which outputs a dynamic graph to represent the effectiveness of neighborhood agents against central agents. The mean-field module approximates the effect of a neighborhood agent on a central agent as the average effect of effective neighborhood agents. We evaluate GAMFQ on three challenging tasks in the MAgents framework. Experiments show that GAMFQ outperforms baselines, including state-of-the-art partially observable mean-field reinforcement learning algorithms.
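The combination of graph attention with a mean-field update can be sketched in a few lines: score each observed neighbor against the central agent's features, softmax the scores, and average the neighbors' one-hot actions with those weights. The linear scoring matrix and all names below are illustrative stand-ins, not the paper's actual encoder.

```python
import numpy as np

rng = np.random.default_rng(1)

def attention_weighted_mean_action(center_feat, neigh_feats, neigh_actions, W):
    """Attention-weighted mean action: a toy stand-in for combining a
    graph attention module with a mean-field module (W and dot-product
    scoring are illustrative, not the paper's exact architecture)."""
    scores = neigh_feats @ W @ center_feat        # attention logits
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                      # softmax over neighbors
    return weights @ neigh_actions                # weighted mean action

d, n_actions, n_neighbors = 4, 3, 5
W = rng.normal(size=(d, d))
center = rng.normal(size=d)
neigh = rng.normal(size=(n_neighbors, d))
acts = np.eye(n_actions)[rng.integers(0, n_actions, size=n_neighbors)]
mean_action = attention_weighted_mean_action(center, neigh, acts, W)
print(mean_action.sum())   # 1.0: still a distribution over actions
```

Because the attention weights form a convex combination of one-hot actions, the result remains a valid action distribution, unlike a raw feature average.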
- Europe > United Kingdom (0.04)
- Europe > Sweden > Stockholm > Stockholm (0.04)
- North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
- (4 more...)
- Research Report (0.82)
- Overview (0.67)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.47)
Safety Embedded Stochastic Optimal Control of Networked Multi-Agent Systems via Barrier States
Song, Lin, Zhao, Pan, Wan, Neng, Hovakimyan, Naira
This paper presents a novel approach for achieving safe stochastic optimal control in networked multi-agent systems (MASs). The proposed method incorporates barrier states (BaSs) into the system dynamics to embed safety constraints. To accomplish this, the networked MAS is factorized into multiple subsystems, and each one is augmented with BaSs for the central agent. The optimal control law is obtained by solving the joint Hamilton-Jacobi-Bellman (HJB) equation on the augmented subsystem, which guarantees safety via the boundedness of the BaSs. The BaS-based optimal control technique yields safe control actions while maintaining optimality. The safe optimal control solution is approximated using path integrals. To validate the effectiveness of the proposed approach, numerical simulations are conducted on a cooperative UAV team in two different scenarios.
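The barrier-state idea can be illustrated with an inverse barrier: encode a safety constraint h(x) >= 0 as a state that blows up near the constraint boundary, so boundedness of that state certifies safety. The inverse barrier and the obstacle-avoidance setup below are one common choice for illustration; the paper's BaS construction and dynamics augmentation may differ.

```python
import numpy as np

def barrier_state(h_value):
    """Inverse-barrier encoding of a safety constraint h(x) >= 0: the
    barrier state diverges as h -> 0+, so keeping it bounded along a
    trajectory certifies constraint satisfaction."""
    return 1.0 / h_value

# Agent avoiding a circular obstacle of radius r at the origin:
# h(x) = ||x||^2 - r^2. A trajectory that stays outside the obstacle
# keeps the barrier state bounded.
r = 1.0
traj = [np.array([3.0 - 0.1 * k, 2.0]) for k in range(15)]
bas = [barrier_state(x @ x - r**2) for x in traj]
print(max(bas))   # bounded, so h stayed positive along the trajectory
```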
- Overview (0.88)
- Research Report (0.70)
Information Preferences of Individual Agents in Linear-Quadratic-Gaussian Network Games
We consider linear-quadratic-Gaussian (LQG) network games in which agents have quadratic payoffs that depend on their individual and neighbors' actions, and an unknown payoff-relevant state. An information designer determines the fidelity of information revealed to the agents about the payoff state to maximize the social welfare. Prior results show that full information disclosure is optimal under certain assumptions on the payoffs, i.e., it is beneficial for the average individual. In this paper, we provide conditions based on the strength of the dependence of payoffs on neighbors' actions, i.e., competition, under which a rational agent is expected to benefit, i.e., receive higher payoffs, from full information disclosure. We find that all agents benefit from information disclosure for the star network structure when the game is symmetric and submodular or supermodular. We also identify that the central agent benefits more than a peripheral agent from full information disclosure unless the competition is strong and the number of peripheral agents is small enough. Despite the fact that all agents expect to benefit from information disclosure ex-ante, a central agent can be worse off from information disclosure in many realizations of the payoff state under strong competition, indicating that a risk-averse central agent can prefer uninformative signals ex-ante.
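The star-network setting can be made concrete with a standard LQG-game best response, a_i = gamma * sum_j W_ij a_j + theta, whose full-information equilibrium solves a linear system. This particular payoff form and the parameter values are illustrative assumptions, not necessarily the paper's specification.

```python
import numpy as np

# Star network: central agent (node 0) linked to 4 peripheral agents.
n = 5
W = np.zeros((n, n))
W[0, 1:] = 1.0
W[1:, 0] = 1.0

# With best responses a_i = gamma * sum_j W_ij a_j + theta (a common
# LQG-game form), the full-information equilibrium solves
# (I - gamma * W) a = theta * 1, provided gamma is small enough for
# the matrix to be invertible.
gamma, theta = 0.2, 1.0
a = np.linalg.solve(np.eye(n) - gamma * W, theta * np.ones(n))
print(a[0] > a[1])   # the central agent takes the largest action
```

With gamma > 0 the game is supermodular (actions are strategic complements), and the central agent's action is amplified by its many neighbors, consistent with its distinct exposure to the disclosed information.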
- North America > United States > Texas > Brazos County > College Station (0.14)
- Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)
HAMMER: Multi-Level Coordination of Reinforcement Learning Agents via Learned Messaging
Gupta, Nikunj, Srinivasaraghavan, G, Mohalik, Swarup Kumar, Taylor, Matthew E.
Cooperative multi-agent reinforcement learning (MARL) has achieved significant results, most notably by leveraging the representation learning abilities of deep neural networks. However, large centralized approaches quickly become infeasible as the number of agents scales, and fully decentralized approaches can miss important opportunities for information sharing and coordination. Furthermore, not all agents are equal: in some cases, individual agents may not even have the ability to send communication to other agents or explicitly model other agents. This paper considers the case where there is a single, powerful, central agent that can observe the entire observation space, and there are multiple, low-powered, local agents that can only receive local observations and cannot communicate with each other. The job of the central agent is to learn what message to send to different local agents, based on the global observations, not by centrally solving the entire problem and sending action commands, but by determining what additional information an individual agent should receive so that it can make a better decision. After explaining our MARL algorithm, HAMMER, and where it would be most applicable, we implement it in the cooperative navigation and multi-agent walker domains. Empirical results show that 1) learned communication does indeed improve system performance, 2) results generalize to multiple numbers of agents, and 3) results generalize to different reward structures.
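The messaging architecture described above can be sketched as a central module that maps the global observation to one learned message per local agent, with each local agent acting on its own observation concatenated with its private message. The class name and the random linear map standing in for the trained network are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)

class CentralMessenger:
    """HAMMER-style messaging sketch: the central agent maps the global
    observation to one message per local agent (a fixed random linear
    map stands in for the learned network)."""
    def __init__(self, global_dim, msg_dim, n_agents, rng):
        self.W = rng.normal(size=(n_agents, msg_dim, global_dim))

    def messages(self, global_obs):
        return np.tanh(self.W @ global_obs)   # one msg_dim vector per agent

n_agents, obs_dim, msg_dim, global_dim = 3, 5, 4, 12
central = CentralMessenger(global_dim, msg_dim, n_agents, rng)
msgs = central.messages(rng.normal(size=global_dim))

# Each local agent acts on its own observation plus its private message;
# the local agents never communicate with each other.
local_inputs = [np.concatenate([rng.normal(size=obs_dim), msgs[i]])
                for i in range(n_agents)]
print([v.shape for v in local_inputs])
```

The key design point is that the central agent sends information, not action commands: each local policy still makes its own decision from the augmented input.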
- Asia > India > Karnataka > Bengaluru (0.05)
- North America > Canada > Alberta > Census Division No. 11 > Edmonton Metropolitan Region > Edmonton (0.04)
- Asia > Middle East > Republic of Türkiye > Manisa Province > Manisa (0.04)
Decentralized Bayesian Learning over Graphs
Lalitha, Anusha, Wang, Xinghan, Kilinc, Osman, Lu, Yongxi, Javidi, Tara, Koushanfar, Farinaz
We propose a decentralized learning algorithm over a general social network. The algorithm leaves the training data distributed on the mobile devices while utilizing a peer-to-peer model aggregation method. The proposed algorithm allows agents with local data to learn a shared model explaining the global training data in a decentralized fashion. The proposed algorithm can be viewed as a Bayesian and peer-to-peer variant of federated learning in which each agent keeps a "posterior probability distribution" over the global model parameters. Each agent updates its "posterior" based on 1) its local training data and 2) asynchronous communication and model aggregation with its 1-hop neighbors. This Bayesian formulation allows for a systematic treatment of model aggregation over any arbitrary connected graph. Furthermore, it provides strong analytic guarantees on convergence in the realizable case as well as a closed-form characterization of the rate of convergence. We also show that our methodology can be combined with efficient Bayesian inference techniques to train Bayesian neural networks in a decentralized manner. Through empirical studies, we show that our theoretical analysis can guide the design of network/social interactions and data partitioning to achieve convergence.
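One standard way to realize the aggregation step over a graph is log-linear (geometric) pooling of the neighbors' posteriors with weights from a row of the mixing matrix. This is a sketch under that assumption: the paper's exact update also folds in a local Bayesian update from each agent's own data.

```python
import numpy as np

def aggregate_posteriors(neighbor_posteriors, weights):
    """Log-linear (geometric) pooling of posteriors over a finite set of
    candidate models: take a weighted average in log space and
    renormalize (one common aggregation rule for graph-based Bayesian
    learning; illustrative, not the paper's exact update)."""
    log_pool = sum(w * np.log(p) for w, p in zip(weights, neighbor_posteriors))
    pooled = np.exp(log_pool - log_pool.max())
    return pooled / pooled.sum()

# Agent 0 mixes its own posterior with those of two 1-hop neighbors.
posteriors = [np.array([0.7, 0.2, 0.1]),
              np.array([0.5, 0.4, 0.1]),
              np.array([0.6, 0.3, 0.1])]
weights = [0.5, 0.25, 0.25]    # e.g., a row of a doubly stochastic matrix
pooled = aggregate_posteriors(posteriors, weights)
print(pooled.sum())   # 1.0: still a valid distribution
```

Pooling in log space keeps the aggregate a proper distribution and, repeated over a connected graph, drives agents toward consensus on the best model.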
- North America > United States > New York > New York County > New York City (0.04)
- Asia > Middle East > Jordan (0.04)
- North America > United States > Virginia (0.04)
- (3 more...)
- Education (0.67)
- Information Technology > Services (0.34)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)